VISEM-Tracking, a human spermatozoa tracking dataset
Manual assessment of sperm motility requires microscopy observation, which
is challenging due to the fast-moving spermatozoa in the field of view. To
obtain correct results, manual evaluation requires extensive training.
Therefore, computer-assisted sperm analysis (CASA) has become increasingly used
in clinics. Despite this, more data is needed to train supervised machine
learning approaches in order to improve accuracy and reliability in the
assessment of sperm motility and kinematics. In this regard, we provide a
dataset called VISEM-Tracking with 20 video recordings of 30 seconds
(comprising 29,196 frames) of wet sperm preparations with manually annotated
bounding-box coordinates and a set of sperm characteristics analyzed by experts
in the domain. In addition to the annotated data, we provide unlabeled video
clips for easy-to-use access and analysis of the data via methods such as self-
or unsupervised learning. As part of this paper, we present baseline sperm
detection performances using the YOLOv5 deep learning (DL) model trained on the
VISEM-Tracking dataset. As a result, we show that the dataset can be used to
train complex DL models to analyze spermatozoa.
PolypConnect: Image inpainting for generating realistic gastrointestinal tract images with polyps
Early identification of a polyp in the lower gastrointestinal (GI) tract can
lead to prevention of life-threatening colorectal cancer. Developing
computer-aided diagnosis (CAD) systems to detect polyps can improve detection
accuracy and efficiency and save the time of the domain experts called
endoscopists. Lack of annotated data is a common challenge when building CAD
systems. Generating synthetic medical data is an active research area to
overcome the problem of having relatively few true positive cases in the
medical domain. To be able to efficiently train machine learning (ML) models,
which are the core of CAD systems, a considerable amount of data should be
used. In this respect, we propose the PolypConnect pipeline, which can convert
non-polyp images into polyp images to increase the size of training datasets.
We present the whole pipeline with quantitative and qualitative
evaluations involving endoscopists. A polyp segmentation model trained on both
synthetic and real data shows a 5.1% improvement in mean intersection
over union (mIOU) compared to a model trained only on real data. The
code for all experiments is available on GitHub to reproduce the results.
Comment: 6 page
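For readers unfamiliar with the metric reported above, the following is a minimal sketch of mean intersection over union for binary segmentation masks. It is not taken from the PolypConnect code: the function names `iou` and `mean_iou` are hypothetical, masks are assumed to be nested lists of 0/1 values, and the mean is assumed to be taken over images (other averaging conventions exist).

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a set of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy example: the first prediction overlaps half of the union,
# the second matches its target exactly.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]
print(mean_iou(preds, targets))
```

In this toy case the per-image scores are 0.5 and 1.0, so the mean is 0.75.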
Usefulness of Heat Map Explanations for Deep-Learning-Based Electrocardiogram Analysis.
Peer reviewed: True
Deep neural networks are complex machine learning models that have shown promising results in analyzing high-dimensional data such as those collected from medical examinations. Such models have the potential to provide fast and accurate medical diagnoses. However, the high complexity makes deep neural networks and their predictions difficult to understand. Providing model explanations can be a way of increasing the understanding of "black box" models and building trust. In this work, we applied transfer learning to develop a deep neural network to predict sex from electrocardiograms. Using the visual explanation method Grad-CAM, heat maps were generated from the model in order to understand how it makes predictions. To evaluate the usefulness of the heat maps and determine if the heat maps identified electrocardiogram features that could be recognized to discriminate sex, medical doctors provided feedback. Based on the feedback, we concluded that, in our setting, this mode of explainable artificial intelligence does not provide meaningful information to medical doctors and is not useful in the clinic. Our results indicate that improved explanation techniques that are tailored to medical data should be developed before deep neural networks can be applied in the clinic for diagnostic purposes.
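As a rough illustration of how a Grad-CAM heat map is formed, the sketch below implements only the weighting step of the method: each feature map is weighted by the average gradient of the class score with respect to it, the weighted maps are summed, and negative values are clipped. It is not the authors' implementation. The function name `grad_cam` is hypothetical, the feature maps are assumed to be one-dimensional (as for a signal along a time axis), and the activations and gradients are made-up numbers rather than values from a trained network.

```python
def grad_cam(activations, gradients):
    """Grad-CAM weighting step for 1D feature maps.

    activations[k][i]: value of feature map k at position i.
    gradients[k][i]: gradient of the class score w.r.t. that value.
    Returns one heat-map value per position.
    """
    # Channel weights: gradient averaged over all positions of each map.
    weights = [sum(g) / len(g) for g in gradients]
    length = len(activations[0])
    cam = []
    for i in range(length):
        value = sum(w * a[i] for w, a in zip(weights, activations))
        # ReLU keeps only positions with positive evidence for the class.
        cam.append(max(value, 0.0))
    return cam


# Hypothetical values: two feature maps over three positions.
activations = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]
gradients = [[0.5, 0.5, 0.5], [-1.0, -1.0, -1.0]]
print(grad_cam(activations, gradients))
```

In a real setting the activations and gradients would come from a chosen convolutional layer of the trained model, and the resulting map would be upsampled to the input length before being overlaid on the electrocardiogram.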
Using machine learning model explanations to identify proteins related to severity of meibomian gland dysfunction
Meibomian gland dysfunction is the most common cause of dry eye disease and leads to significantly reduced quality of life and social burdens. Because meibomian gland dysfunction results in impaired function of the tear film lipid layer, studying the expression of tear proteins might increase the understanding of the etiology of the condition. Machine learning is able to detect patterns in complex data. This study applied machine learning to classify levels of meibomian gland dysfunction from tear proteins. The aim was to investigate proteomic changes between groups with different severity levels of meibomian gland dysfunction, as opposed to only separating patients with and without this condition. An established feature importance method was used to identify the most important proteins for the resulting models. Moreover, a new method that can take the uncertainty of the models into account when creating explanations was proposed. By examining the identified proteins, potential biomarkers for meibomian gland dysfunction were discovered. The overall findings are largely confirmatory, indicating that the presented machine learning approaches are promising for detecting clinically relevant proteins. While this study provides valuable insights into proteomic changes associated with varying severity levels of meibomian gland dysfunction, it should be noted that it was conducted without a healthy control group. Future research could benefit from including such a comparison to further validate and extend the findings presented here.